File Buddy Help
Regular Expressions
File Buddy supports the use of regular expressions in the Perl 5 style for certain search operations. For complete documentation on the syntax supported by File Buddy, see http://www.newlisp.org/downloads/pcrepattern.html.
Basic pattern syntax
The simplest use of a regular expression is to match a literal string.
Regular expression   Subject text             Matched text
simple               a simple file name.txt   simple
file                 a simple file name.txt   file

The following are special characters:

\ [ ] { } ( ) $ ^ * . ? : - +

Special characters are not compared literally with characters in the text for matching purposes, but control matching in other ways. For example, a period (.) matches any character:
Regular expression   Subject text                  Matched text
.m                   a simple text file name.txt   im, am
.m.                  a simple text file name.txt   imp, ame

Because they have special meanings, special characters must be escaped using a backslash (\) to match them:
Regular expression   Subject text               Matched text
(simple)             a (simple) file name.txt   simple
\(simple\)           a (simple) file name.txt   (simple)
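
The literal, period, and escaping rules above can be tried out in a short sketch. It uses Python's re module, which is an assumption made for illustration: Python follows the same Perl-style conventions for these basic features, but it is not the engine File Buddy uses, and the variable names are hypothetical.

```python
import re

subject = "a (simple) file name.txt"

# A literal pattern matches the characters exactly as written.
literal = re.findall(r"file", subject)

# An unescaped period matches any single character (here, the one before "m").
any_char = re.findall(r".m", "a simple text file name.txt")

# Unescaped parentheses only group, so the match excludes the parentheses.
grouped = re.search(r"(simple)", subject).group(0)

# Escaped parentheses are compared literally with the subject text.
escaped = re.search(r"\(simple\)", subject).group(0)

# re.escape adds the backslashes automatically for arbitrary literal text.
auto_escaped = re.search(re.escape("name.txt"), subject).group(0)

print(literal, any_char, grouped, escaped, auto_escaped)
```

In Python, re.escape is a convenient way to search for a file name that contains periods or parentheses without escaping each special character by hand.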

To match a pattern only at the beginning of the subject string, start the pattern with a circumflex character (^).
Regular expression   Subject text                  Matched text
a                    a simple text file name.txt   a (both occurrences)
^a                   a simple text file name.txt   a (first character only)

To match a pattern only at the end of the subject string, end the pattern with a dollar sign ($):
Regular expression   Subject text                  Matched text
xt                   a simple text file name.txt   xt (both occurrences)
xt$                  a simple text file name.txt   xt (at the end only)
.x.$                 AppleCuda.kext                ext
.x.$                 To Do List.txt                txt
.x.$                 URLMountUIProxy               oxy

To match only the entire string, use both (e.g. “^hello$”).
Regular expression   Subject text   Matched text
hello                hello hello    hello (both occurrences)
^hello               hello hello    hello (first occurrence only)
hello$               hello hello    hello (last occurrence only)
^hello$              hello hello    (no match)
^hello$              hello          hello

An unescaped period (.) matches any character.
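
As a rough illustration of the anchors described above, the following sketch reruns the table examples with Python's re module. Python is an assumption chosen for illustration, not File Buddy's own engine, and the helper name all_matches is hypothetical.

```python
import re


def all_matches(pattern, subject):
    """Return every non-overlapping match of pattern in subject."""
    return re.findall(pattern, subject)


sentence = "a simple text file name.txt"

# Without an anchor the pattern matches anywhere; with ^ only at the start.
unanchored_a = all_matches(r"a", sentence)
start_only = all_matches(r"^a", sentence)

# With $ the pattern matches only at the end of the subject string.
end_only = all_matches(r"xt$", sentence)

# With both anchors the pattern must account for the entire string.
whole_string = [s for s in ("hello", "hello hello") if re.search(r"^hello$", s)]

# ".x.$" picks out a three-character ending whose middle character is "x".
names = ("AppleCuda.kext", "To Do List.txt", "URLMountUIProxy")
endings = [re.search(r".x.$", name).group(0) for name in names]

print(unanchored_a, start_only, end_only, whole_string, endings)
```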

Classes
To match any one of a given set of characters, such as whitespace, alphabetic characters, or numeric characters, use a character class. A character class is a set of characters enclosed in brackets ([]). Any single character listed in the brackets will match. Ranges are allowed and are interpreted in ASCII order. If a circumflex (^) is the first character, the class is negated, so that any character not in the brackets will match. There are also several special classes which match a predefined set of characters, and shortcuts for those classes. A named class such as [:alpha:] is itself written inside the brackets of a class, as in [[:alpha:]]. A partial list of common special classes and shortcuts is:
[:space:], \s, [ \t\n\r] The set of standard whitespace characters: space, tab, newline, and carriage return.
[:alpha:], [A-Za-z] Any alphabetic character.
[:alnum:], [A-Za-z0-9] Any alphanumeric character.
[:digit:], \d, [0-9] Any numeric character.

Regular expression   Subject text      Matched text
[A-Za-z0-9]          March 16, 1955    M, a, r, c, h, 1, 6, 1, 9, 5, 5
[A-Z0-9]             March 16, 1955    M, 1, 6, 1, 9, 5, 5
[aeiou]              new catalog.doc   e, a, a, o, o
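
The class examples above can be checked with a small sketch in Python's re module, used here only as an assumed stand-in for File Buddy's engine. One difference worth noting: Python does not accept the named classes such as [[:alpha:]], so the sketch uses the explicit ranges and the \d and \s shortcuts instead.

```python
import re

date = "March 16, 1955"

# Explicit ranges: any letter or digit.
alnum = re.findall(r"[A-Za-z0-9]", date)

# Only uppercase letters and digits.
upper_or_digit = re.findall(r"[A-Z0-9]", date)

# A plain list of characters: any lowercase vowel.
vowels = re.findall(r"[aeiou]", "new catalog.doc")

# A leading circumflex negates the class: anything that is not a digit.
non_digits = re.findall(r"[^0-9]", date)

# Shortcuts: \d for digits and \s for whitespace.
digits = re.findall(r"\d", date)
whitespace = re.findall(r"\s", "a\tb c\n")

print(alnum, upper_or_digit, vowels, non_digits, digits, whitespace)
```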
Repetition
To match a given character or class more than once, follow it with a repetition modifier. The modifiers are:
{n,m} Match the character between n and m times, inclusive.
{n} Match the character exactly n times.
{n,} Match the character n or more times.
{,n} Match the character between zero and n times, inclusive. Exactly the same as {0,n}
* Match the character zero or more times. Exactly the same as {0,}
? Match the character either zero or one times. Exactly the same as {0,1} or {,1}
+ Match the character one or more times. Exactly the same as {1,}
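
A minimal sketch of each repetition modifier follows, again assuming Python's re module purely for illustration. Python accepts the {,n} form; whether another engine does is not something this sketch demonstrates. The variable names are hypothetical.

```python
import re

text = "March 16, 1955"

# {n}: exactly n repetitions (four digits in a row).
exactly_four = re.findall(r"[0-9]{4}", text)

# {n,m}: between n and m repetitions, inclusive.
one_or_two = re.findall(r"[0-9]{1,2}", text)

# {n,}: n or more repetitions.
two_or_more_l = re.findall(r"l{2,}", "helo hello helllo")

# {,n}: zero to n repetitions, the same as {0,n} in Python.
up_to_two = [s for s in ("", "a", "aa", "aaa") if re.fullmatch(r"a{,2}", s)]

# ?: zero or one repetition.
optional_u = re.findall(r"colou?r", "color colour")

# *: zero or more repetitions; +: one or more repetitions.
star = [s for s in ("a", "ab", "abbb") if re.fullmatch(r"ab*", s)]
plus = [s for s in ("a", "ab", "abbb") if re.fullmatch(r"ab+", s)]

print(exactly_four, one_or_two, two_or_more_l, up_to_two, optional_u, star, plus)
```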
Subpatterns
To apply a modifier to more than one character, or to capture a substring for later use in replacement, use a subpattern. Any part of a regular expression enclosed in parentheses is considered a subpattern. Subpatterns are numbered from left to right, starting from one, in the order of their opening parentheses. By default, all subpatterns are capturing, which means they count in the list of subpatterns and can be used for conditional evaluation and replacement. By using (?:) instead of (), a subpattern can be made non-capturing. Such a subpattern is used for grouping purposes only and is not numbered.
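
The difference between capturing and non-capturing subpatterns can be seen in a short sketch. It assumes Python's re module, which uses the same () and (?:) syntax; the subject string and variable names are made up for illustration.

```python
import re

subject = "xxababcdxx"

# Both groups capture: group 1 is "ab" (its last repetition), group 2 is "cd".
capturing = re.search(r"(ab)+(cd)", subject)

# (?:ab) only groups, so "cd" becomes subpattern number 1.
non_capturing = re.search(r"(?:ab)+(cd)", subject)

# A modifier after a group applies to the whole group, not just one character.
repeated_group = re.fullmatch(r"(?:ab){2}", "abab") is not None

# The number of numbered subpatterns in each expression.
capturing_count = re.compile(r"(ab)+(cd)").groups
non_capturing_count = re.compile(r"(?:ab)+(cd)").groups

print(capturing.groups(), non_capturing.groups(), repeated_group)
```

Both searches match the same text (ababcd); only the numbering of the captured substrings differs.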

Replacing with backreferences
A reference to a captured subpattern is called a backreference. In a replacement string, “$n\” refers to the text matched by the nth subpattern in the pattern string. Let’s look at an example of a replacement using backreferences in which we replace:

See (.+) run with (.+)

with:

Watch $2\ walk with $1\

 Original                  After replacement
 See Jane run with Mary    Watch Mary walk with Jane
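
The same replacement can be sketched in Python's re module, with one assumption to keep in mind: Python writes backreferences in a replacement string as \1 and \2 rather than the “$n\” form used by File Buddy, so the replacement string below is a translation, not File Buddy syntax.

```python
import re

pattern = r"See (.+) run with (.+)"

# \2 and \1 are Python's spelling of the second and first subpatterns.
replacement = r"Watch \2 walk with \1"

original = "See Jane run with Mary"
result = re.sub(pattern, replacement, original)

# The captured substrings, in subpattern order.
captured = re.search(pattern, original).groups()

print(result)
```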
Examples
Consider the following expression:

^(H([[:alpha:]]{4}), (wor(?:l.)))$

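This regular expression contains three numbered subpatterns. They are:

1. H([[:alpha:]]{4}), (wor(?:l.))
2. [[:alpha:]]{4}
3. wor(?:l.)

The subpattern (?:l.) is non-capturing and therefore is not numbered.
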
It will match all of the following strings (among many others):